Poster: LucasMation Date: Mar 31, 2016 6:20am
Forum: web Subject: how to query for all the websites that end in ".com.br"?

Is there a way to query the wayback machine for a list of websites (or all website-date pairs) ending in ".com.br"?

Poster: pegzmasta Date: Apr 1, 2016 10:13am
Forum: web Subject: Re: how to query for all the websites that end in '.com.br'?

> Is there a way to query the wayback machine for a list of websites (or all website-date pairs) ending in ".com.br"?

According to the FAQ, full-text search may be implemented in the future, but it is not possible yet. Right now, what you are asking for is only possible with Archive-It. Here is an example:

Search for: ".com.br" on Archive-It

This post was modified by pegzmasta on 2016-04-01 17:13:36

Poster: LucasMation Date: Apr 1, 2016 12:03pm
Forum: web Subject: Re: how to query for all the websites that end in '.com.br'?

OP here. I managed to make some progress using the wayback-cdx-server API. (https://github.com/internetarchive/wayback/tree/master/wayback-cdx-server)

The following queries cover *.com.br and *.gov.br:

http://web.archive.org/cdx/search/cdx?url=com.br/&matchType=domain
http://web.archive.org/cdx/search/cdx?url=gov.br/&matchType=domain

You can also limit the number of items returned:
http://web.archive.org/cdx/search/cdx?url=*.com.br/&matchType=domain&limit=1000
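For anyone who wants to script this, here is a rough Python sketch of how the query URLs above could be built. The `url`, `matchType` and `limit` parameters are the ones used in the links above. The `fl` parameter (field list) is something I am taking from the CDX server README and have not verified against the live service, so treat it as an assumption. The function names `build_cdx_url` and `parse_cdx_line` are just names made up for this example, and the sample line in the comment is invented, not real output.

```python
from urllib.parse import urlencode

# Endpoint taken from the URLs posted above.
CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_url(domain, limit=None, fields=None):
    """Build a CDX query URL for every capture under `domain`.

    `fields` maps to the `fl` parameter described in the CDX server
    README (assumption: not verified against the live service).
    """
    params = {"url": domain, "matchType": "domain"}
    if fields:
        params["fl"] = ",".join(fields)
    if limit is not None:
        params["limit"] = str(limit)
    # Keep "/" and "," readable instead of percent-encoding them.
    return CDX_ENDPOINT + "?" + urlencode(params, safe="/,")


def parse_cdx_line(line, fields):
    """Split one space-separated result line into a dict keyed by field name."""
    return dict(zip(fields, line.split(" ")))


if __name__ == "__main__":
    # Website-date pairs for *.com.br, capped at 1000 rows.
    print(build_cdx_url("com.br", limit=1000, fields=["original", "timestamp"]))
    # Invented sample line, only to show the expected shape:
    print(parse_cdx_line("http://example.com.br/ 20160331062000",
                         ["original", "timestamp"]))
```

Fetching the resulting URL (with `urllib.request` or similar) should return one capture per line, but I have left the network call out so the sketch runs offline.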

Poster: pegzmasta Date: Apr 1, 2016 12:19pm
Forum: web Subject: Re: how to query for all the websites that end in '.com.br'?

Now, THIS is interesting!

I didn't think that you would have the patience to explore the API route. Extra kudos to you for researching this! The resource on GitHub is definitely worth listing. You can only do so much using the usual web interface that everyone is currently familiar with, but this API grants so much more control and functionality.

Poster: sahil7459 Date: May 25, 2017 5:00am
Forum: web Subject: Re: how to query for all the websites that end in '.com.br'?

Does that URL work for you? It's not working on my end.